Transparent software emulation as an alternative to hardware bus lock

ABSTRACT

One of the disclosed embodiments comprises a method for software emulation of hardware bus lock in a computer system, comprising: acquiring a bus lock semaphore; providing the bus lock semaphore to a device attempting a bus lock operation; acquiring a page table semaphore; invalidating page table entries to prevent access to a location in the computer system referenced by the bus lock operation, wherein the location is an input/output (I/O) device; purging translation lookaside buffer pages associated with the location; and executing a clone instruction without lock semantics.

CROSS REFERENCE TO RELATED APPLICATION(S)

This is a Continuation application of U.S. Ser. No. 10/430,395, filed May 7, 2003, which is a Divisional application of U.S. Ser. No. 09/504,023, filed Feb. 18, 2000, the disclosures of both of which are hereby incorporated by reference in their entireties.

TECHNICAL FIELD

The technical field is computer architectures that use mechanisms to prevent conflicting access to shared computer resources.

BACKGROUND

Current computer systems use various means to ensure temporary exclusivity of access of a central processing unit (CPU) or other active system agent to a memory data item or input/output device. One such means is a bus lock. A bus lock is a hardware mechanism in which the agent programmed for atomic, exclusive access to the memory data item, the input/output device, or a combination of the two, signals its requirement for exclusive access using a signal transmitted onto a system bus or interconnect. Other system agents are then prevented from accessing the locked item or items during the interval that the locking agent signals exclusive access.

This hardware bus locking mechanism presents serious performance issues, with a disproportionately larger impact on larger systems. This is because it is generally prohibitively complex to restrict the scope of the lock, using hardware mechanisms, to the particular items being accessed. For example, large portions of a system, or even the entire system, may be inaccessible to other agents during the lock, causing substantial stalls.

Even without trying to narrow the scope of the lock, the implementation of hardware bus lock in large systems is complex. The complexity arises from having to propagate the lock indication through the system, over perhaps many busses or other interconnects, while managing conflicting, simultaneous lock attempts so as to assure forward progress and data integrity. For example, current computer systems may be implemented using several busses, all running in parallel. In such a system, two or more lock attempts may occur simultaneously, and some type of arbitration mechanism would be required to determine which active agent acquires the bus lock. Otherwise, a deadlock situation could arise, and system processing could be halted.

One solution to this problem is to implement the computer system hardware so that no matter how large the computer system, a system-wide bus lock is available. However, this solution is impractical because one process running on one processor in the computer system can cause a system-wide stall during the time the bus lock is being serviced.

Another solution to the above design performance problems is to administer the exclusivity using cacheable semaphores. However, this choice is not directly available in the case of a computer system required to be backward compatible with a legacy architecture that makes the bus lock feature available in a visible way to software.

SUMMARY

A software emulation module provides the functions of a hardware bus lock without the attendant disadvantages of the hardware bus lock. Code sequences that would ordinarily trigger a bus lock signal are used to cause a fault. A fault handler acquires a cacheable semaphore that is reserved for bus lock emulation purposes. The fault handler also acquires a semaphore that is used to ensure exclusive access to native page tables or equivalent address space protection mechanisms. The software emulation module then causes invalidation of relevant page table entries, purges translation lookaside buffer pages of the locking accesses, and sets a mode bit that defeats any fault-on-lock-attempt behavior. The emulation module then locally inserts any needed translation/protection entries, executes the locking sequence and then clears the mode bit. Finally, the emulation module causes the semaphores to be released and the normal flow of execution returns.

Using the software emulation module, the portion of memory space that is locked may be reduced, using paging or a similar mechanism, and other system agents are only affected if the agents attempt to access the locked portion of the memory space during the interval in which the bus lock signal is asserted.

DESCRIPTION OF THE DRAWINGS

The detailed description will refer to the following drawings in which like numerals refer to like objects, and in which:

FIG. 1 is a block diagram of a computer architecture that uses software emulation as an alternative to hardware bus lock;

FIG. 2 is a block diagram of the fault handler used with the computer architecture of FIG. 1;

FIG. 3 illustrates a logical device used with the fault handler of FIG. 2; and

FIGS. 4-6 are flowcharts illustrating the processes of software emulation.

DETAILED DESCRIPTION

An alternative to a hardware bus lock mechanism provides exclusive access to a data item in a memory of a computer system, a device in the computer system, or a set of the item and the device. A software emulation module provides the functionality of exclusive access to such shared computer system resources without the drawbacks inherent in current computer systems. The software emulation module causes signals that could cause a bus lock to instead cause a fault. A fault handler then causes execution of a series of steps that provide exclusive access to a desired item.

FIG. 1 shows a computer system 10 that uses software emulation of hardware bus lock. A first bus 20 connects first processors 22. A second bus 30, operating in parallel with the first bus 20, connects second processor 32. The first processors 22 may be of a first computer architecture. That is, the first processors 22 may operate in accordance with an instruction set of the first computer architecture. The second processor 32 may be of a second architecture, such as a legacy architecture, for example. Alternatively, the second processor 32 may be of the same architecture as the first processors 22. Furthermore, multiple second processors 32 may connect to the second bus 30. Finally, the first bus 20 and the second bus 30 may comprise multiple busses. For example, the first bus 20 may comprise an address portion, data portion and control portion. Further, there may be additional busses similar to those shown or any other hierarchy or arrangement.

An input/output (I/O) chipset 40 is coupled to the first bus 20 and the second bus 30. The I/O chipset 40 performs many functions, such as receiving interrupts generated by I/O devices (not shown) and distributing them among the first processors 22 and the second processor(s) 32. The I/O chipset 40 may also control access to main memory 55 and other computer system architectural features.

The main memory 55 may include one or more semaphores that are used in conjunction with the software emulation of hardware bus lock. The semaphores may be cacheable.

A fault handler 50 may be a software module operating on a CPU. The fault handler 50 is shown in more detail in FIG. 2, and may include an emulation module 60 that provides for software emulation as an alternative to hardware bus lock.

FIG. 3 illustrates a logic device 57 that receives a mode bit (or lock fault enable bit) 59 and a lock semantic indication (or request for a bus lock) 61. The mode bit 59 is supplied from a configuration register and the lock semantic indication 61 is supplied from an execution unit. The logic device 57 outputs a lock fault 63 to an exception unit.
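
The gating behavior of FIG. 3 can be sketched as a truth function. This is only an illustration: it assumes the logic device 57 acts as a simple AND of its two inputs, which the description does not state explicitly, and the function and variable names are hypothetical.

```python
def lock_fault(lock_fault_enable: bool, lock_semantic: bool) -> bool:
    """Model of logic device 57, assuming it is a plain AND gate."""
    return lock_fault_enable and lock_semantic


# Enumerate every input combination of mode bit 59 and indication 61.
truth_table = {
    (mode_bit, request): lock_fault(mode_bit, request)
    for mode_bit in (False, True)
    for request in (False, True)
}
```

Under this assumption, a lock fault 63 is raised only when faulting is enabled and an instruction with lock semantics is executed.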

In operation, the fault handler 50 may acquire a cacheable first, or bus lock, semaphore 56 that is reserved for bus lock emulation purposes. The bus lock semaphore 56 may be in a cache or in the main memory 55. That is, the bus lock semaphore 56 may be in a defined memory location that is reserved, or set aside by firmware at, for example, boot up, so that the bus lock semaphore 56 is invisible to the computer system's operating system. Then, only the emulation module 60 would have access to the bus lock semaphore 56.

The fault handler 50 may also acquire a second, or page table semaphore 58, or an equivalent address space protection mechanism, that is used to ensure exclusive access to native page tables. That is, the computer system 10 may have an address translation and protection mechanism that is implemented by a page table in the main memory 55. The page table is used to cross-reference virtual addresses to physical addresses and to protect memory pages from various types and privilege levels of access. The page table is protected by the page table semaphore 58. Therefore, before a processor can alter the page table, the processor first acquires the page table semaphore 58.

Using emulation software 62 in the emulation module 60, the fault handler 50 may invalidate relevant page table entries. This prevents other processors from looking up this locked region of main memory 55 in the page table, and then accessing the locked region. In essence, the locked region of the main memory 55 is hidden from other processors or agents in the computer system 10. The emulation software 62 in the emulation module 60 would then be used to purge any page table entries in translation lookaside buffers (TLBs) (not shown in FIG. 1) that correspond to the invalidated page table entries.

The fault handler 50 then sets a mode bit that defeats any fault-on-lock-attempt behavior. Then, the fault handler 50 locally inserts any needed translation or protection entries. That is, to access the locked memory region, translation protection entries (translations between virtual and physical addresses) are written to the page table or a TLB, or both. The instruction sequence that caused the bus lock is then executed, either by re-executing the same instruction sequence that caused the fault-on-bus-lock, or by executing a clone routine that performs the same operation, but without the locked semantic attached. Once the bus lock instruction sequence has been executed, the fault handler 50 clears the mode bit and releases the first and the second semaphores. Processing in the computer system 10 then returns to the normal execution flow.
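
The sequence just described can be sketched as a toy model. This is a minimal sketch under stated assumptions, not the patented implementation: the class, its fields, the 4096-byte page size, and the step that restores the shared page table entry before the semaphores are released are all hypothetical, and ordinary Python locks stand in for the semaphores 56 and 58.

```python
import threading

PAGE_SIZE = 4096  # assumed page size; the description does not fix one


class EmulatedSystem:
    """Toy model of the fault-handler sequence; all names are hypothetical."""

    def __init__(self, cpus):
        self.bus_lock_sem = threading.Lock()    # stands in for semaphore 56
        self.page_table_sem = threading.Lock()  # stands in for semaphore 58
        self.page_table = {}                    # virtual page -> physical page
        self.tlbs = {cpu: {} for cpu in range(cpus)}
        self.mode_bit = {cpu: False for cpu in range(cpus)}

    def handle_lock_fault(self, cpu, vaddr, locked_op):
        vpn = vaddr // PAGE_SIZE
        # Acquire the bus lock semaphore first, then the page table semaphore.
        with self.bus_lock_sem, self.page_table_sem:
            # Hide the page: invalidate the entry and purge every TLB.
            entry = self.page_table.pop(vpn)
            for tlb in self.tlbs.values():
                tlb.pop(vpn, None)
            # Defeat fault-on-lock for this CPU and insert a local mapping.
            self.mode_bit[cpu] = True
            self.tlbs[cpu][vpn] = entry
            try:
                return locked_op()
            finally:
                self.mode_bit[cpu] = False
                # Assumption: the shared mapping is restored before release.
                self.page_table[vpn] = entry


system = EmulatedSystem(cpus=2)
system.page_table[5] = 42
system.tlbs[1][5] = 42
seen = {}


def locked_op():
    # Runs while page 5 is hidden from the page table and from CPU 1.
    seen["hidden"] = 5 not in system.page_table and 5 not in system.tlbs[1]
    seen["mode_bit"] = system.mode_bit[0]
    return "done"


result = system.handle_lock_fault(cpu=0, vaddr=5 * PAGE_SIZE + 8, locked_op=locked_op)
```

Both locks are held for the whole operation, mirroring the description, in which the two semaphores are released only after the mode bit is cleared.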

Using the fault handler 50 and the emulation module 60, the scope of the bus lock mechanism can be reduced to a small address space, using paging or a similar mechanism, for example. Another processor or agent is only affected if that processor attempts to access the same address space during the interval that the bus lock signal is asserted. That is, once a fault is taken on a bus lock attempt, data related to the fault is saved and is observable by the fault handler 50. The saved data includes the address or addresses that were attempted to be accessed. Thus the fault handler 50 has available the address or addresses that were attempted to be accessed during a bus lock operation. This permits the emulation module 60 to remove access to only a page or two of memory, rather than remove access to all the computer system resources or to a larger region of the main memory 55.
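
Why "a page or two" suffices can be seen from a short calculation. The sketch below assumes a 4096-byte page and assumes the saved fault data yields a starting address and an access size; the function name is hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size


def pages_to_lock(fault_addr, access_size):
    """Return the virtual page numbers touched by one locked access."""
    first = fault_addr // PAGE_SIZE
    last = (fault_addr + access_size - 1) // PAGE_SIZE
    return list(range(first, last + 1))


aligned = pages_to_lock(0x1000, 4)    # access lies within one page
straddling = pages_to_lock(0x1FFE, 4)  # access crosses a page boundary
```

An access smaller than a page touches one page when it lies within a page and two pages when it crosses a boundary, so only those pages need to be hidden from other agents.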

Software emulation of hardware bus lock has been described above in relation to access to memory. However, the same software emulation mechanism may be used in the case of a simultaneous access of an I/O device by more than one processor of the computer system 10. In this situation, the I/O device appears to the processors as just another memory mapped entity, or region of memory.

In another situation, a processor and an I/O device may attempt to access the same memory region simultaneously. The I/O device may include a TLB or a paging-type mechanism, similar to that associated with the processor. In this case, operation of the emulation module 60 and the fault handler 50 would be the same as the situation in which multiple processors attempt to access the same memory region. If the I/O device does not include a TLB or paging-type mechanism, then the fault handler 50 may temporarily disable an I/O device or set of devices to prevent the I/O device from accessing the memory region.

The fault handler 50 and the emulation module 60 may be used with multiple architecture processors, such as the processors 22 and 32. However, one or more processors might not operate in a paged or otherwise protected scheme. That is, one or more of the processors might operate in a real mode, or might access the main memory 55 in the real mode. In this alternative, a real-mode lock attempt may cause an inter-processor interrupt to be sent to other processors. The other processors would then return an interrupt acknowledged signal before the real-mode lock is asserted. The other processors would enter a wait state and would be awakened by another interrupt when the locked operation was complete. However, this situation should occur only rarely.
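
The real-mode handshake can be sketched as a small state machine. This is a single-threaded illustration only: the class, method names, and state strings are hypothetical, and real inter-processor interrupts are replaced by direct method calls.

```python
class Processor:
    """Toy processor that parks itself when it receives a lock interrupt."""

    def __init__(self, name):
        self.name = name
        self.state = "running"

    def on_lock_interrupt(self):
        self.state = "waiting"
        return True  # the interrupt-acknowledged signal

    def on_wake_interrupt(self):
        self.state = "running"


def real_mode_locked_op(others, locked_op):
    # Send the interrupt and collect every acknowledgement first.
    acks = [p.on_lock_interrupt() for p in others]
    if not all(acks):
        raise RuntimeError("a processor failed to acknowledge the interrupt")
    try:
        return locked_op()  # the lock is asserted only after all acks
    finally:
        for p in others:
            p.on_wake_interrupt()  # second interrupt wakes the waiters


others = [Processor("P1"), Processor("P2")]
states_during = []
result = real_mode_locked_op(
    others, lambda: states_during.extend(p.state for p in others) or "ok"
)
```

Because every other processor is parked, this path stalls the whole system for the duration of the locked operation, which is consistent with the remark that it should be rare.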

The computer system 10 includes operating system (O/S) page table management code that is restricted from acquiring a bus lock itself. Otherwise, a first processor or agent could acquire the bus lock semaphore 56. Then, a second processor operating on the O/S page table management code could acquire the page table semaphore 58, and then attempt to acquire the locked semaphore 56 using the emulation sequence. In this event, the first and the second processors deadlock because neither can acquire both the bus lock semaphore 56 and the page table semaphore 58.
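
The deadlock described above is a cycle in a wait-for relationship between the two processors. The following sketch, with hypothetical names, checks for such a cycle given which agent holds each semaphore and which semaphore each blocked agent wants.

```python
def has_deadlock(holds, wants):
    """holds maps semaphore -> owning agent; wants maps agent -> semaphore."""
    for start in wants:
        seen = set()
        agent = start
        # Follow "agent waits for the owner of the semaphore it wants".
        while agent is not None and agent in wants:
            if agent in seen:
                return True  # returned to a waiting agent: a cycle
            seen.add(agent)
            agent = holds.get(wants[agent])
    return False


holds = {"bus_lock": "P1", "page_table": "P2"}

# P2 runs O/S page table code that also attempts a bus lock.
unrestricted = has_deadlock(holds, {"P1": "page_table", "P2": "bus_lock"})

# O/S page table code is restricted from acquiring a bus lock.
restricted = has_deadlock(holds, {"P1": "page_table"})
```

With the restriction in place, the holder of the page table semaphore never waits on the bus lock semaphore, so it can finish and release, and the cycle cannot form.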

As noted above, the software emulation of hardware bus lock mechanism may be used with multiple computer architectures. An example is the use of software emulation with an IA-32 architecture and an IA-64 architecture. In this situation, the IA-32 architecture may be implemented using microcode. One feature of the microcoded IA-32 architecture is single step trapping. A trap is an exception that is reported immediately following the execution of a trapping instruction. An exception is generated by a processor when the processor detects errors during execution of an application program or the O/S code. Traps allow execution of the application program to be continued without loss of program continuity. A return address for a trap handler points to the instruction to be executed after the trapping instruction. The single step trapping allows the locking instruction sequence to be implemented by setting a flag for a trap handler to hand off to the fault handler 50, clearing the single-step mode and re-executing the macroinstruction that faulted.

As noted above, the software emulation mechanism may set a mode bit that defeats fault-on-lock-attempt behavior. In an alternative embodiment, the mode bit may be a mode bit that distinguishes between hardware bus lock and software emulation of hardware bus lock. In this embodiment, a bus lock signal may be asserted by a processor, but would be ignored by the computer system, such as by physical disconnection. Thus, the processor, apart from the computer system, may support both hardware bus lock and software emulation of hardware bus lock. This allows the processor to be used in a computer system that supports hardware bus lock only as well as in a computer system that supports software emulation of hardware bus lock.

Operation of the software emulation mechanism will now be described with reference to FIGS. 4-6, which are flowcharts of the processes executed to implement software emulation of hardware bus lock. In FIG. 4, the process starts at block 100 with a processor attempting a bus lock operation, for example to access a desired memory region. The fault handler 50 acquires the bus lock semaphore 56 and provides the bus lock semaphore to the processor, such as one of the processors 22, that attempted the bus lock, block 110. The bus lock semaphore 56 may be cached in a cache associated with the processor 22. The fault handler 50 next acquires the page table semaphore 58, which may also be cached, block 120. The fault handler 50 then invalidates any relevant page table entries, block 130, thereby preventing access to the desired memory region. The fault handler 50 also purges any TLB pages associated with the desired memory region, block 140.

To execute the bus lock, the fault handler 50 sets a mode bit that defeats a fault-on-lock attempt of the processor, block 150, and locally inserts any needed translation/protection entries, block 160. Next, the single step or break point is set, block 165, and the process jumps to the original lockup instruction, block 170. The process then moves to break point, block 175.

FIG. 5 illustrates the process for the single step break point handler routine. The process starts at block 177, and then (block 180) the fault handler 50 clears the mode bit set at block 150 (see FIG. 4). Finally, the processor clears the single step trap (block 185) and releases the bus lock semaphore 56 and the page table semaphore 58. Processing in the computer system then returns to normal execution, block 200, and the process ends, block 210.
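
The split of work between the lock fault handler of FIG. 4 and the single step handler of FIG. 5 can be sketched as two routines operating on shared processor state. The sketch is illustrative only: the class, field names, and the driver function that plays the role of the hardware are hypothetical, and page table and TLB work is reduced to log entries.

```python
class Cpu:
    """Toy per-processor state; field names are hypothetical."""

    def __init__(self):
        self.mode_bit = False     # defeats fault-on-lock when True
        self.single_step = False  # single step trap armed when True
        self.held = []            # semaphores currently held
        self.log = []


def lock_fault_handler(cpu):
    """Setup phase, loosely following blocks 110-165 of FIG. 4."""
    cpu.held += ["bus_lock", "page_table"]
    cpu.log.append("invalidate page table entries and purge TLBs")
    cpu.mode_bit = True
    cpu.log.append("insert local translation entries")
    cpu.single_step = True


def single_step_handler(cpu):
    """Cleanup phase, loosely following blocks 180-185 of FIG. 5."""
    cpu.mode_bit = False
    cpu.single_step = False
    cpu.held.clear()


def run_locked_instruction(cpu, instruction):
    """Driver standing in for the hardware's fault and trap delivery."""
    if not cpu.mode_bit:
        lock_fault_handler(cpu)   # first attempt faults
    result = instruction()        # re-execution runs with faulting defeated
    if cpu.single_step:
        single_step_handler(cpu)  # trap taken right after the instruction
    return result


cpu = Cpu()
held_during = []
value = run_locked_instruction(cpu, lambda: held_during.extend(cpu.held) or 7)
```

The single step trap is what returns control to the emulation code after exactly one re-executed instruction, so the cleanup cannot be skipped.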

FIG. 6 illustrates an alternative process for software emulation of hardware bus lock in which a clone instruction is executed. In FIG. 6, process blocks 220-260 correspond to process blocks 100-140 of FIG. 4. In block 270, the local translation entries are inserted. In block 280, the clone instruction sequence is executed, without lock semantics. The remaining blocks 290-310 correspond to process blocks 190-210 of FIG. 4.
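
A clone routine is simply the same read-modify-write performed with ordinary accesses. The sketch below uses a one-byte exchange-and-add as the example operation, which is an assumption chosen for illustration; the function name is hypothetical, and the routine is only safe because the emulation sequence has already hidden the page from other agents.

```python
def clone_exchange_add(memory, addr, value):
    """Unlocked clone of a locked exchange-and-add on a single byte."""
    old = memory[addr]                  # plain load, no lock semantic
    memory[addr] = (old + value) & 0xFF  # plain store, wraps at one byte
    return old


memory = bytearray([250])
previous = clone_exchange_add(memory, 0, 10)
```

Because the clone carries no lock semantic, it cannot raise a lock fault, so this variant does not need the mode bit or the single step trap.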

The terms and descriptions used herein are set forth by way of illustration only and are not meant as limitations. Those skilled in the art will recognize that many variations are possible within the spirit and scope of the invention as defined in the following claims, and their equivalents, in which all terms are to be understood in their broadest possible sense unless otherwise indicated.

1. An apparatus, implemented in a computer system, for emulating a hardware bus lock, the computer system comprising a central processing unit (CPU) executing an operating system (O/S), processors, and a main memory, the apparatus comprising: the main memory accessible by the CPU and the processors, the main memory comprising a reserved memory block and a page table, wherein the reserved memory block stores a bus lock semaphore and a page table semaphore, and wherein page tables cross reference virtual addresses and physical addresses; and a fault handler module operating on the CPU, the fault handler module including an emulation module wherein the fault handler receives an indication of an attempted operation capable of causing the hardware bus lock, wherein the operation comprises accessing a desired location in the computer system and an attempted simultaneous access to an input/output (I/O) device.
 2. The apparatus of claim 1, wherein the bus lock semaphore is invisible to the O/S.
 3. The apparatus of claim 1, wherein exclusive access to native page tables is maintained.
 4. The apparatus of claim 1, wherein the emulation module comprises: means for purging page table entries in TLBs that correspond to invalidated page table entries; means for defeating a fault-on-lock attempt behavior of a processor.
 5. The apparatus of claim 2, wherein the means for defeating the fault-on-lock attempt behavior is a mode bit set by the fault handler.
 6. The apparatus of claim 3, wherein the mode bit distinguishes between hardware bus lock and software emulation of hardware bus lock.
 7. A method, executed in a computer system including processors, translation lookaside buffers (TLBs) for translating between virtual addresses and physical addresses, and a memory, the memory including page tables for preventing a hardware bus lock by software emulation, the method comprising: at a device in the computer system, attempting an operation capable of causing the hardware bus lock, wherein the operation comprises accessing a desired location in the computer system and attempted simultaneous access to an input/output (I/O) device; at a fault handler, receiving an indication of the attempted operation and acquiring a bus lock semaphore; caching the bus lock semaphore at a cache associated with the device; at the fault handler: invalidating selected page table entries, wherein access to the desired location is prevented, and purging any TLB page tables associated with the desired location; and executing an instruction sequence to cause a fault-on-the-bus-lock.
 8. The method of claim 7, wherein executing the instruction sequence comprises re-executing the same instruction sequence as the attempted operation.
 9. The method of claim 7, wherein executing the instruction sequence comprises executing a clone routine that performs the same operation as the attempted operation without a lock semantic.
 10. The method of claim 7, wherein the device is a processor.
 11. The method of claim 10, wherein the processor accesses the main memory in a real mode, the method further comprising: asserting an inter-processor interrupt; and sending the interrupt to other processors in the computer system, the other processors entering a wait state in response to the interrupt, the other processors waking upon completion of the locked operation.
 12. A method for software emulation of hardware bus lock in a computer system, comprising: acquiring a bus lock semaphore; providing the bus lock semaphore to a device attempting a bus lock operation; acquiring a page table semaphore; invalidating page table entries to prevent access to a location in the computer system referenced by the bus lock operation, wherein the location is an input/output (I/O) device; purging translation lookaside buffer pages associated with the location; and executing a clone instruction without lock semantics.
 13. The method of claim 12, further comprising inserting local translation entries.
 14. The method of claim 12, further comprising releasing the bus lock and page table semaphores.
 15. The method of claim 12, wherein the bus lock operation is attempted by a processor operating in a real mode, the method further comprising asserting an inter-processor interrupt.